
Reading on PDA

You may be surprised to see yet another article about reading on a PDA, but I have my reasons for writing it.

There are dedicated reading devices, such as the Sony Reader, which may be better suited to the task, but PDAs have important advantages: small size, low price, and a great variety of reading programs and converters. So a lot of people buy PDAs only for reading. There are two major flavors of PDA: Palm OS devices and Windows CE/Pocket PC/Windows Mobile devices.

I have an old Palm OS 4 device, a Sony Clie SJ20, and it is very convenient for reading.

So if you have a PDA or a modern smartphone with a big screen, you probably know some ways to read books on it. Nevertheless, I will remind you of some basics.

Basic reading

If you want to read an ebook that you have on your PC (or Mac) in a common format, such as plain text, HTML or MS Word, you can transfer it to your PDA and read it with your favorite software, such as Documents To Go on Palm or Pocket Word/Pocket IE on Windows.

You can also convert such a book into a more compact and convenient format and read it with a dedicated reading program. I prefer iSilo, because it supports documents with complex formatting and images and has a very handy desktop converter, iSiloX, which works under Windows and under Linux (using Wine), as well as a native command-line converter for Linux and Mac.

Many Pocket PC users prefer HaaliReader.

These methods work really well for simple electronic documents, so let’s move on to more advanced cases.

Reading CHM

This format is rather widespread for ebooks. Pocket PC users shouldn't have any problems with this format.

CHM is compiled HTML, so you need to extract the HTML files before you can convert them to iSilo. On Windows you can do this with the command:

hh.exe -decompile Destination_Directory filename.chm

I used to use KeyTools, a small free set of graphical tools for working with CHM files, which also works well under Wine.

Then you can convert the HTML to iSilo.
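
If you have many CHM books, the decompile step can be scripted. Here is a minimal Python sketch that only builds (and optionally runs) the hh.exe command shown above. The function names and the one-folder-per-book layout are assumptions made for this example, and running it is only useful on Windows, where hh.exe exists.

```python
import subprocess
from pathlib import Path


def decompile_command(chm_path, out_root):
    """Build the hh.exe command that unpacks one .chm file.

    Each book gets its own folder under out_root, named after the file.
    """
    chm = Path(chm_path)
    dest = Path(out_root) / chm.stem
    return ["hh.exe", "-decompile", str(dest), str(chm)]


def decompile_all(chm_dir, out_root, run=False):
    """Return the commands for every .chm in chm_dir; execute them if run=True."""
    commands = [
        decompile_command(path, out_root)
        for path in sorted(Path(chm_dir).glob("*.chm"))
    ]
    if run:  # only meaningful on Windows
        for cmd in commands:
            Path(cmd[2]).mkdir(parents=True, exist_ok=True)
            # The exit code of hh.exe is not checked in this sketch.
            subprocess.run(cmd)
    return commands
```

Calling decompile_all("books", "html") just lists the commands, so you can check them before passing run=True.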

Reading PDF

In general this is not an easy task.

There are several types of PDF documents. Text-only PDFs can easily be converted to text and then read on a PDA: you can select all the text and paste it into your favorite text editor or word processor. Some PDA reading programs support such PDF documents directly. You can also convert the PDF to HTML with Adobe Acrobat or pdftohtml to preserve some formatting, and then convert the HTML to iSilo. This method does not always work, because PDF documents often include non-standard fonts. It is a particular problem for scientific books and papers, because all formulas are lost during conversion.

The second kind of PDF document consists of images, such as scanned books. These documents cannot easily be converted for a PDA, because the page images don’t fit on the small screen of a PDA or smartphone.
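
A rough way to tell the two kinds of PDF apart is to check how much text can be extracted from the file. The Python sketch below assumes that the pdftotext tool (part of the Poppler utilities, which also include pdftohtml) is installed. The function names are made up for this example, and the threshold of 200 characters per page is an arbitrary value chosen for illustration, not a tested rule.

```python
import subprocess


def extract_text(pdf_path):
    """Run pdftotext and return the extracted text ("-" sends it to stdout)."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def looks_scanned(text, min_chars_per_page=200):
    """Guess whether a PDF is image-only from the text pdftotext returned.

    pdftotext ends each page with a form feed, so splitting on it gives pages.
    min_chars_per_page is an arbitrary threshold chosen for this sketch.
    """
    pages = text.split("\f")
    if pages and pages[-1].strip() == "":
        pages.pop()  # drop the empty piece after the final form feed
    if not pages:
        return True
    total = sum(len(page.strip()) for page in pages)
    return total / len(pages) < min_chars_per_page
```

If looks_scanned(extract_text("book.pdf")) is true, the file is probably a scan and the text-based methods above will not help.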

But I can give you some ideas on how to deal with this problem, so keep reading.

Reading scanned books

It is usually possible to OCR such books with a general-purpose OCR program, such as FineReader, and then convert the text to iSilo or your favorite reading format, but these programs are helpless if the book contains formulas.

So I’ll give some advice on

Reading scientific papers

There are a lot of scientific papers and books on the Web.

For example, most modern articles on physics and mathematics are available for free from arXiv. You can also get access to electronic journals through systems such as Metapress, ScienceDirect and Scopus.

In most cases articles are provided in PDF or PostScript format, but on arXiv you can also download the sources of articles, usually in LaTeX format. This is really helpful, because LaTeX can be converted to HTML with TeX4ht.

Converting LaTeX with TeX4ht

To use TeX4ht you need to install:

  • LaTeX: you can use the MiKTeX distribution on Windows, or install LaTeX from your package repository if you use Linux
  • TeX4ht: it is included in modern versions of MiKTeX and is also available as a package in Linux distributions
  • ImageMagick, to convert formulas into images: it is available in most Linux distributions

If you use Windows, you may have some trouble installing TeX4ht, so this document may be helpful.

After you have installed everything, you can convert your LaTeX document (an article from arXiv, for example) using the following command:

  htlatex your_tex_file_name "html,pic-m"

“html,pic-m” tells TeX4ht to convert the file into HTML and make images from formulas. You can find additional information about TeX4ht options in the manual.
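
If you convert articles often, the htlatex call can be wrapped in a small script. The Python sketch below assumes that htlatex is on your PATH and that the article sources are already unpacked into a folder. The helper names are hypothetical, and looking for \documentclass to find the main file is only a heuristic: it misses old sources written in plain TeX or with \documentstyle.

```python
import subprocess
from pathlib import Path


def find_main_tex(source_dir):
    """Return the first .tex file that mentions \\documentclass, or None."""
    for tex in sorted(Path(source_dir).rglob("*.tex")):
        if "\\documentclass" in tex.read_text(errors="ignore"):
            return tex
    return None


def htlatex_command(tex_file, options="html,pic-m"):
    """Build the htlatex call from the article, to be run in the file's folder."""
    return ["htlatex", Path(tex_file).name, options]


def convert(source_dir):
    """Convert the main LaTeX file in source_dir and return the expected HTML path."""
    main = find_main_tex(source_dir)
    if main is None:
        raise FileNotFoundError("no .tex file with \\documentclass found")
    # Run inside the source folder so that relative \input paths resolve.
    subprocess.run(htlatex_command(main), cwd=main.parent, check=True)
    return main.with_suffix(".html")
```

Running htlatex in the folder of the main file matters because articles usually include figures and extra .tex files by relative path.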

Now you can convert the HTML to iSilo or read it with Pocket IE.

It’s a pity that the LaTeX source of articles and books is not always available. But there are some ways to cope with this.

Recognizing text with formulas with InftyReader

If you have a good-quality PDF document or scanned book, you can try to recognize it using InftyReader. This program can recognize formulas and English text. You can download a trial version and use it to produce TeX, which you can then convert using TeX4ht.

The problem here is that your document must be of really good quality. Also, there is no support for other languages.

Reading scanned scientific books

Many scientific books and old articles are available only in scanned form, that is, as images, usually packed into PDF or DJVU. For example, a lot of Russian and English scientific books can be downloaded from the Kolhoz library or the library of the Mechanical and Mathematical Department of MSU (these libraries don’t give open access, though). Scans of old articles are available from publishers' sites, such as Springer. These scans are usually of rather poor quality, to keep downloads fast, and can’t be recognized with InftyReader.

Reading DJVU

There are several DJVU readers for Pocket PC/Windows Mobile, for example ExpressView from LizardTech (the company that sells the commercial DJVU tools). I can recommend two programs:

  • PocketDJVU: this program is free software and is rather stable and fast, but if you read on a smartphone or a QVGA Pocket PC you will have to scroll the page horizontally to read each line of text.
  • SmartDJVU: a commercial program created by Inscenic. It has a great feature: it can rewrap the words of a DJVU document using the information in its text layer, and it is quite easy to create a text layer in a DJVU document with DocumentExpress from LizardTech. But SmartDJVU is rather unstable, and its developers now say that there will be no new versions because SmartDJVU is unprofitable. Anyway, it is a really big step in the right direction.

So, there are quite good DJVU readers for Pocket PC, and you will have almost no problems reading DJVU books if you have a Pocket PC with a VGA screen.

But if you have a Palm or a smartphone, or you can’t add a text layer to your DJVU document, you will need something else. Also, there is no way to recognize handwritten text, such as lecture notes.

Converting scanned papers with Fit2PDA

You can use my program Fit2PDA to convert books or other scanned documents. The principle behind it is rather simple: Fit2PDA splits the pages of a scanned book into lines and splits the lines into parts so that they fit the PDA display. You can see how it works here. Fit2PDA produces HTML with images as wide as the screen of your device.
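
To make the idea concrete, here is a small Python sketch of line and word splitting on a black-and-white bitmap. It is not Fit2PDA's actual code: the function names, the list-of-rows bitmap (1 = ink, 0 = paper), the word_gap parameter and the greedy grouping are all assumptions chosen only to illustrate the principle.

```python
def ink_runs(profile, min_gap=1):
    """Return (start, end) spans of inked entries; gaps shorter than min_gap are bridged."""
    runs, start, gap = [], None, 0
    for i, value in enumerate(profile):
        if value:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                runs.append((start, i - gap + 1))
                start, gap = None, 0
    if start is not None:
        runs.append((start, len(profile) - gap))
    return runs


def split_line(word_spans, screen_width):
    """Greedily group word spans into pieces no wider than screen_width."""
    pieces, piece_start, piece_end = [], None, None
    for start, end in word_spans:
        while end - start > screen_width:  # a word wider than the screen is cut
            if piece_start is not None:
                pieces.append((piece_start, piece_end))
                piece_start = None
            pieces.append((start, start + screen_width))
            start += screen_width
        if piece_start is None:
            piece_start, piece_end = start, end
        elif end - piece_start <= screen_width:
            piece_end = end  # the word still fits into the current piece
        else:
            pieces.append((piece_start, piece_end))
            piece_start, piece_end = start, end
    if piece_start is not None:
        pieces.append((piece_start, piece_end))
    return pieces


def fit_page(page, screen_width, word_gap=2):
    """Return (top, bottom, left, right) boxes of screen-sized pieces in reading order."""
    boxes = []
    row_profile = [any(row) for row in page]
    for top, bottom in ink_runs(row_profile):
        band = page[top:bottom]
        col_profile = [any(row[x] for row in band) for x in range(len(page[0]))]
        words = ink_runs(col_profile, min_gap=word_gap)
        for left, right in split_line(words, screen_width):
            boxes.append((top, bottom, left, right))
    return boxes
```

Each box can then be cropped from the page image and written to the HTML as a separate picture, so the reader only scrolls down.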

You need to download Fit2PDA from the program's page and run fit2pda.exe or fit2pda. Then add your books to the program and choose the conversion parameters: the width of your device's screen and the desired resolution (170-190 dpi for modern Palms or Pocket PCs, 70-90 dpi for older Palms and Pocket PCs, 96 or 130 dpi for smartphones). You can also choose to split the resulting HTML into chunks, because some programs on some devices are unable to show documents with a large number of big images. Then press Convert and have a cup of coffee, because the conversion will take a rather long time.
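
The arithmetic behind these parameters can be sketched in a few lines of Python. I assume here that the desired resolution means the dpi the scan is rescaled to, and that the resolution of the original scan is known. The function names and the bare-bones HTML are illustrative only and are not what Fit2PDA itself emits.

```python
def scaled_size(width_px, height_px, scan_dpi, target_dpi):
    """Pixel size of a page after rescaling it from scan_dpi to target_dpi."""
    scale = target_dpi / scan_dpi
    return round(width_px * scale), round(height_px * scale)


def chunk_html(image_names, images_per_chunk):
    """Return HTML documents, each holding at most images_per_chunk images."""
    chunks = []
    for i in range(0, len(image_names), images_per_chunk):
        group = image_names[i:i + images_per_chunk]
        body = "\n".join(f'<img src="{name}"><br>' for name in group)
        chunks.append(f"<html><body>\n{body}\n</body></html>")
    return chunks
```

For example, scaled_size(2400, 3300, 300, 180) gives (1440, 1980): a page scanned at 300 dpi and 2400 pixels wide is still 1440 pixels wide at 180 dpi, so on a 320-pixel screen each text line has to be cut into several pieces.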

Then you can convert this HTML to iSilo (don’t forget to uncheck the “Resize large images” checkbox) or read it with Pocket IE.

Fit2PDA may also be useful for converting handwritten lecture notes, as you can see here in the bottom picture, but you had better first improve the image quality, especially the contrast, and suppress the squares of the notebook paper.
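
One simple way to do both is to stretch the contrast and then threshold the image. The Python sketch below works on a grayscale page stored as a list of rows with values from 0 (black) to 255 (white). It assumes that the grid lines are noticeably lighter than the ink, which is not always true, and the gray levels in the usage note are made-up numbers.

```python
def stretch_contrast(gray, low, high):
    """Map gray levels so that low becomes 0 and high becomes 255, clipping the rest."""
    span = max(high - low, 1)
    return [
        [min(255, max(0, (p - low) * 255 // span)) for p in row]
        for row in gray
    ]


def binarize(gray, threshold):
    """Return 1 for pixels darker than threshold (ink) and 0 for everything else."""
    return [[1 if p < threshold else 0 for p in row] for row in gray]
```

With ink at level 30, grid lines at 170 and paper at 240, binarize([[30, 170, 240]], 128) returns [[1, 0, 0]]: the ink survives, while the grid and the paper both become background.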

You can also write to me or read my blog if you want to know more about Fit2PDA.

Conclusion

I use all of the methods mentioned above to convert documents, and I read a lot of scientific papers and books on my PDA. A PDA may not be perfect for reading, but it is a very flexible and convenient tool.

 
projects/reading/english.txt · Last modified: 2007/06/27 08:07 by anton
 